Probabilistic Context-free Grammars in Natural Language Processing
Abstract
Context-free grammars (CFGs) are a class of formal grammars that have found numerous applications in modeling computer languages. A probabilistic form of the CFG, the probabilistic CFG (PCFG), has also been applied successfully to modeling natural languages. In this paper, we discuss the use of PCFGs in natural language modeling. We develop PCFGs as a natural extension of CFGs and explain one probabilistic parser for PCFGs in detail. We also outline two methods for estimating PCFG probabilities. Finally, we state the limitations of PCFGs and mention how they can be augmented to model natural languages better.
Similar resources
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Probabilistic Grammars and Hierarchical Dirichlet Processes
Probabilistic context-free grammars (PCFGs) have played an important role in the modeling of syntax in natural language processing and other applications, but choosing the proper model complexity is often difficult. We present a nonparametric Bayesian generalization of the PCFG based on the hierarchical Dirichlet process (HDP). In our HDP-PCFG model, the effective complexity of the grammar can ...
On the Computation of Distances for Probabilistic Context-Free Grammars
Probabilistic context-free grammars (PCFGs) are used to define distributions over strings, and are powerful modelling tools in a number of areas, including natural language processing, software engineering, model checking, bio-informatics, and pattern recognition. A common important question is that of comparing the distributions generated or modelled by these grammars: this is done through che...
Grammatical Inference of some Probabilistic Context-Free Grammars from Positive Data using Minimum Satisfiability
Recently, different theoretical learning results have been found for a variety of context-free grammar subclasses through the use of distributional learning (Clark, 2010b). However, these results are still not extended to probabilistic grammars. In this work, we give a practical algorithm, with some proven properties, that learns a subclass of probabilistic grammars from positive data. A minimu...
Treebank-Based Probabilistic Phrase Structure Parsing
The area of probabilistic phrase structure parsing has been a central and active field in computational linguistics. Stochastic methods in natural language processing, in general, have become very popular as more and more resources become available. One of the main advantages of probabilistic parsing is in disambiguation: it is useful for a parsing system to return a ranked list of potential sy...
Publication date: 2005